Fast Approximate Spoken Term Detection from Sequence of Phonemes
Authors
Abstract
We investigate the detection of spoken terms in conversational speech using phoneme recognition, with the objective of achieving a smaller index size as well as faster search speed. Speech is processed and indexed as a one-best phoneme sequence. We propose the use of a probabilistic pronunciation model for the search term to compensate for errors in phoneme recognition. This model is derived from the pronunciation of the word and the phoneme confusion matrix. Experiments are performed on the conversational telephone speech database distributed by NIST for the 2006 Spoken Term Detection evaluation. We achieve an index about 1500 times smaller and a search about 14 times faster than the state-of-the-art system using phoneme lattices, at the cost of relatively lower detection performance.
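As a rough illustration of how a phoneme confusion matrix can compensate for recognition errors when searching a one-best phoneme index, the sketch below scores every window of the index against the query pronunciation. It is not the paper's model: it assumes substitution errors only (no insertions or deletions), and the confusion probabilities, the `FLOOR` value, the threshold, and the function names are all hypothetical toy choices.

```python
import math

# Hypothetical confusion probabilities P(recognized | spoken).
# Toy values for illustration only, not taken from the paper.
CONFUSION = {
    "k": {"k": 0.8, "g": 0.2},
    "ae": {"ae": 0.7, "eh": 0.3},
    "t": {"t": 0.9, "d": 0.1},
}
FLOOR = 1e-4  # assumed probability for confusions missing from the table


def window_log_prob(query, window):
    """Log-probability that `window` was recognized when `query` was spoken,
    assuming independent, substitution-only errors."""
    return sum(
        math.log(CONFUSION.get(spoken, {}).get(recognized, FLOOR))
        for spoken, recognized in zip(query, window)
    )


def detect(query, index, threshold):
    """Slide the query over the one-best phoneme index and return
    (start, score) for each window scoring at or above `threshold`."""
    n = len(query)
    hits = []
    for start in range(len(index) - n + 1):
        score = window_log_prob(query, index[start:start + n])
        if score >= threshold:
            hits.append((start, score))
    return hits


# One-best recognizer output (toy data) and the query pronunciation.
index = ["s", "k", "eh", "t", "ah", "g", "ae", "t"]
query = ["k", "ae", "t"]
hits = detect(query, index, threshold=math.log(0.1))
```

With these toy values, the two windows that differ from the query by a single plausible substitution pass the threshold, while windows containing an unlisted confusion fall far below it. Lowering the threshold trades more false alarms for fewer misses.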
Similar resources
Evaluation of Fast Spoken Term Detection Using a Suffix Array
We previously proposed [1] fast spoken term detection that uses a suffix array as a data structure for searching large-scale speech documents. In this method, a keyword is divided into sub-keywords, and the phoneme sequences that contain two or more sub-keywords are output as results. Although the search is executed very quickly on a 10,000-h speech database, we only proposed a variety of matc...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Metric subspace indexing for fast spoken term detection
In this paper, we propose a novel indexing method for Spoken Term Detection (STD). The proposed method can be considered as using metric space indexing for the approximate string-matching problem, where the distance between a phoneme and a position in the target spoken document is defined. The proposed method does not require the use of thresholds to limit the output, instead being able to outpu...
YLAB@RU at Spoken Term Detection Task in NTCIR-10 SpokenDoc-2
Spoken term detection (STD) techniques, which detect a given word or phrase in spoken documents, are widely developed in order to provide easy access to large amounts of multimedia content that includes speech. This paper describes an improvement of the STD method that is based on vector quantization (VQ) and was proposed in NTCIR-9 SpokenDoc. Spoken documents are represen...
Fast subword-based approach for open vocabulary spoken term detection
This paper describes an efficient two-stage approach using a sub-phonetic segment N-gram index and shift continuous dynamic programming for open vocabulary spoken term detection. With this two-stage search, we attempt to improve both retrieval accuracy and processing time. In the speech recognition process, a more sophisticated subword that is shorter than phonemes is used to minimize...